Estimating the moments of a random vector with applications
A general result about the quality of approximation of the mean of a distribution by its empirical estimate is proven; the result does not involve the dimension of the feature space. Using the kernel trick, this also gives bounds on the quality of approximation of higher order moments. A number of applications of interest in learning theory are derived, including a new novelty detection algorithm and rigorous bounds on the Robust Minimax Classification algorithm.
Shortcuts to Artificial Intelligence
The current paradigm of Artificial Intelligence emerged as the result of a series of cultural innovations, some technical and some social. Among them are apparently small design decisions, that led to a subtle reframing of the field's original goals, and are by now accepted as standard. They correspond to technical shortcuts, aimed at bypassing problems that were otherwise too complicated or too expensive to solve, while still delivering a viable version of AI. Far from being a series of separate problems, recent cases of unexpected effects of AI are the consequences of those very choices that enabled the field to succeed, and this is why it will be difficult to solve them. In this chapter we review three of these choices, investigating their connection to some of today's challenges in AI, including those relative to bias, value alignment, privacy and explainability. We introduce the notion of "ethical debt" to describe the necessity to undertake expensive rework in the future in order to address ethical problems created by a technical system.
Social Machinery and Intelligence
Social machines are systems formed by technical and human elements interacting in a structured manner. The use of digital platforms as mediators allows large numbers of human participants to join such mechanisms, creating systems where interconnected digital and human components operate as a single machine capable of highly sophisticated behaviour. Under certain conditions, such systems can be described as autonomous and goal-driven agents. Many examples of modern Artificial Intelligence (AI) can be regarded as instances of this class of mechanisms. We argue that this type of autonomous social machine has provided a new paradigm for the design of intelligent systems, marking a new phase in the field of AI. The consequences of this observation range from the methodological and philosophical to the ethical. On the one side, it emphasises the role of Human-Computer Interaction in the design of intelligent systems, while on the other side it draws attention to both the risks for a human being and those for a society relying on mechanisms that are not necessarily controllable. The difficulty companies face in regulating the spread of misinformation, as well as the difficulty authorities face in protecting task-workers managed by a software infrastructure, could be just some of the effects of this technological paradigm.
Biased Embeddings from Wild Data: Measuring, Understanding and Removing
Many modern Artificial Intelligence (AI) systems make use of data embeddings,
particularly in the domain of Natural Language Processing (NLP). These
embeddings are learnt from data that has been gathered "from the wild" and have
been found to contain unwanted biases. In this paper we make three
contributions towards measuring, understanding and removing this problem. We
present a rigorous way to measure some of these biases, based on the use of
word lists created for social psychology applications; we observe how gender
bias in occupations reflects actual gender bias in the same occupations in the
real world; and finally we demonstrate how a simple projection can
significantly reduce the effects of embedding bias. All this is part of an
ongoing effort to understand how trust can be built into AI systems.
Comment: Author's original version
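The "simple projection" mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the bias is captured by a single direction in the embedding space and that debiasing means removing each vector's component along that direction; the abstract does not give the exact procedure, so this is an illustration rather than the authors' method. The toy vectors and the names `remove_direction` and `bias_direction` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_direction(vec, direction):
    """Return vec with its component along `direction` projected out."""
    norm_sq = dot(direction, direction)
    if norm_sq == 0:
        raise ValueError("direction must be non-zero")
    scale = dot(vec, direction) / norm_sq
    return [a - scale * d for a, d in zip(vec, direction)]


# Hypothetical 3-dimensional toy embeddings, invented for illustration only.
he = [1.0, 0.0, 0.5]
she = [-1.0, 0.0, 0.5]

# Assumption: the bias direction is the difference of one word pair.
bias_direction = [a - b for a, b in zip(he, she)]

nurse = [-0.6, 0.8, 0.1]
debiased = remove_direction(nurse, bias_direction)

# After projection the vector has no component along the bias direction.
print(dot(debiased, bias_direction))
```

In practice a bias direction would more plausibly be estimated from many word pairs or word lists rather than a single pair; the single pair here only keeps the sketch short.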
History Playground: A Tool for Discovering Temporal Trends in Massive Textual Corpora
Recent studies have shown that macroscopic patterns of continuity and change
over the course of centuries can be detected through the analysis of time
series extracted from massive textual corpora. Similar data-driven approaches
have already revolutionised the natural sciences, and are widely believed to
hold similar potential for the humanities and social sciences, driven by the
mass-digitisation projects that are currently under way, and coupled with the
ever-increasing number of documents which are "born digital". As such, new
interactive tools are required to discover and extract macroscopic patterns
from these vast quantities of textual data. Here we present History Playground,
an interactive web-based tool for discovering trends in massive textual
corpora. The tool makes use of scalable algorithms to first extract trends from
textual corpora, before making them available for real-time search and
discovery, presenting users with an interface to explore the data. Included in
the tool are algorithms for standardization, regression, change-point detection
in the relative frequencies of ngrams, multi-term indices and comparison of
trends across different corpora.
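As a rough illustration of two of the steps named in this abstract, the sketch below computes the relative frequency of an n-gram per time step and then standardizes the resulting series. It assumes that standardization means a z-score (subtract the mean, divide by the standard deviation), which the abstract does not specify; the counts are invented and the function names are hypothetical, not part of History Playground.

```python
from statistics import mean, pstdev


def relative_frequencies(counts, totals):
    """Divide each n-gram count by the total n-gram count of its time step."""
    return [c / t for c, t in zip(counts, totals)]


def standardize(series):
    """Z-score a series; a constant series maps to all zeros."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


# Invented yearly counts for one n-gram and for the whole corpus.
counts = [2, 4, 6, 8]
totals = [100, 100, 100, 100]

rel = relative_frequencies(counts, totals)
z = standardize(rel)
print(z)
```

Standardizing in this way puts series from corpora of different sizes on a common scale, which is one plausible reason to apply it before comparing trends.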
Machine Decisions and Human Consequences
As we increasingly delegate decision-making to algorithms, whether directly
or indirectly, important questions emerge in circumstances where those
decisions have direct consequences for individual rights and personal
opportunities, as well as for the collective good. A key problem for
policymakers is that the social implications of these new methods can only be
grasped if there is an adequate comprehension of their general technical
underpinnings. The discussion here focuses primarily on the case of enforcement
decisions in the criminal justice system, but draws on similar situations
emerging from other algorithms utilised in controlling access to opportunities,
to explain how machine learning works and, as a result, how decisions are made
by modern intelligent algorithms or 'classifiers'. It examines the key aspects
of the performance of classifiers, including how classifiers learn, the fact
that they operate on the basis of correlation rather than causation, and that
the term 'bias' in machine learning has a different meaning to common usage. An
example of a real world 'classifier', the Harm Assessment Risk Tool (HART), is
examined, through identification of its technical features: the classification
method, the training data and the test data, the features and the labels,
validation and performance measures. Four normative benchmarks are then
considered by reference to HART: (a) prediction accuracy, (b) fairness and
equality before the law, (c) transparency and accountability, and (d) informational
privacy and freedom of expression, in order to demonstrate how its technical
features have important normative dimensions that bear directly on the extent
to which the system can be regarded as a viable and legitimate support for, or
even alternative to, existing human decision-makers.